Incremental ontology-based integration for translational medical research

Author

  • Fabian Prasser
Abstract

Translational medical research is an emerging concept that aims at transforming discoveries from basic sciences into diagnostic and therapeutic applications. In the opposite direction, clinical data are needed for feedback and as stimuli for the generation of new research hypotheses. This process is highly data-intensive and centered around the idea of integrating data from basic biomedical sciences, clinical sciences and patient care. Therefore, collaboration and information exchange are needed between previously separated domains, many of which are themselves fragmented. The complexity and heterogeneity of the involved data grow constantly with scientific progress, and the related biomedical structures and processes are subject to rapid change. For this reason, structured domain knowledge, e.g., from knowledge bases, is often required in order to adequately understand and interpret results. Furthermore, integration solutions have to be robust and flexible enough to handle changes in data and metadata. Security and privacy aspects are highly relevant and require the incorporation of complex access control mechanisms as well as concepts for data anonymization and pseudonymization. In this thesis, an ontology-based methodology for integrating heterogeneous biomedical datasets in a distributed environment is proposed first. It advocates an incremental approach that builds upon data coexistence and aims at carrying out semantic integration in a demand-oriented and flexible manner. The importance of structured domain knowledge is addressed by blurring the boundaries between primary data and metadata. Data federation allows researchers to maintain control over their local datasets and is also utilized to model a fine-grained access control mechanism. Robustness is achieved by designing the system as a set of loosely coupled components, which can be added, altered and removed independently of each other.
Second, an implementation based on a large distributed graph of uniquely identifiable nodes is presented. The groundwork is laid by novel techniques for mapping biomedical data sources into the graph. As all further components require integrated access to the global data, several compile-time and run-time optimization techniques for the efficient distributed execution of queries are presented. Manual semantic integration is supported by concepts for browsing, annotating and mapping data items. Automated semantic integration and data transformation are supported via a flexible workflow engine, which builds upon the querying interface. Here, result sets can be post-processed with a scripting language that provides domain-specific operators, such as semantic reasoning and data anonymization. The resulting transformed data can be re-integrated into the graph. Finally, a prototypical implementation is presented which integrates the individual components into a comprehensive data integration solution. It provides a querying interface for applications and allows data integrators to administer the data space via a unified graphical user interface.

Similar resources

Translational integrity and continuity: Personalized biomedical data integration

Translational research data are generated in multiple research domains from the bedside to experimental laboratories. These data are typically stored in heterogeneous databases, held by segregated research domains, and described with inconsistent terminologies. Such inconsistency and fragmentation of data significantly impedes the efficiency of tracking and analyzing human-centered records. To ...

A Semantic Image Annotation Model to Enable Integrative Translational Research

Integrating and relating images with clinical and molecular data is a crucial activity in translational research, but challenging because the information in images is not explicit in standard computer-accessible formats. We have developed an ontology-based representation of the semantic contents of radiology images called AIM (Annotation and Image Markup). AIM specifies the quantitative and qua...

Capturing phenotypes for precision medicine

Deep phenotyping followed by integrated computational analysis of genotype and phenotype is becoming ever more important for many areas of genomic diagnostics and translational research. The overwhelming majority of clinical descriptions in the medical literature are available only as natural language text, meaning that searching, analysis, and integration of medically relevant information in d...

An Incremental and Iterative Process for Ontology Building

The ontology development area has received some attention over the years. Methodologies focusing on diverse aspects of ontology development have emerged. Some of these methodologies are consolidated, presenting phases and activities. However, existing methodologies do not fully consider the ontology integration process. Therefore, based on METHONTOLOGY and a methodology for integrating ontologi...

ONSTR: The Ontology for Newborn Screening Follow-up and Translational Research

Translational research in the field of newborn screening requires integration of data generated during the various phases of lifelong treatment of patients identified and diagnosed through newborn dried blood spot screening (NDBS). In this paper, we describe the O...

Publication date: 2013